DATA CLEANING

PICKLE THE DATAFRAME
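
A minimal sketch of this step, assuming the cleaned data lives in a pandas DataFrame. The column names, sample rows, and file name below are hypothetical placeholders, not the actual dataset.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the cleaned review data.
df = pd.DataFrame(
    {
        "review": ["love driving this car", "dealer service was slow"],
        "rating": [5, 2],
    }
)

# Write the DataFrame to disk so later steps can skip the cleaning stage.
# The path is illustrative; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "cleaned_reviews.pkl")
df.to_pickle(path)

# Load it back when needed.
restored = pd.read_pickle(path)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.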

Perplexity measures how well a probability distribution or probability model predicts a sample, and it can be used to compare probability models. A low perplexity indicates that the model predicts the sample well.
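
To make the definition concrete, here is a small sketch that computes perplexity as the exponential of the negative average log-probability a model assigns to the observed items. The function name and the example probabilities are illustrative assumptions, not values from this analysis.

```python
import math


def perplexity(probabilities):
    """Perplexity from the probabilities a model assigned to each observed item."""
    if not probabilities:
        raise ValueError("need at least one probability")
    # Average log-probability of the observed items under the model.
    avg_log_prob = sum(math.log(p) for p in probabilities) / len(probabilities)
    return math.exp(-avg_log_prob)


# A model that spreads probability evenly over 4 outcomes has perplexity close to 4.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])

# A model that assigns high probability to what was observed scores lower (better).
confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

Comparing the two values shows why lower is better: the confident model is "less surprised" by the sample than the uniform one.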

The coherence score assesses the quality of the learned topics: the closer the score is to 1, the better.
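
As an illustration of how such a score can be computed, the sketch below averages normalized pointwise mutual information (NPMI) over the word pairs of a topic, using document-level co-occurrence. This is a simplified variant chosen for clarity and is not necessarily the coherence measure used in this analysis; the function name and toy documents are hypothetical.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs in a topic (range -1 to 1)."""
    if len(topic_words) < 2:
        raise ValueError("need at least two topic words")
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # the words never co-occur
        elif p12 == 1:
            scores.append(1.0)  # the words co-occur in every document
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Toy tokenized documents (hypothetical).
docs = [
    ["car", "seat", "comfort"],
    ["car", "seat", "comfort"],
    ["dealer", "service"],
    ["dealer", "service"],
]

related = npmi_coherence(["seat", "comfort"], docs)    # words that always appear together
unrelated = npmi_coherence(["seat", "dealer"], docs)   # words that never appear together
```

Words that tend to appear in the same documents push the score toward 1, which matches the intuition that a coherent topic groups related terms.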

Topic 1 -->> Problems with dealers and services
Topic 2 -->>
Topic 3 -->> Love driving the car
Topic 4 -->> Car performance and quality
Topic 5 -->> Speed and fuel consumption
Topic 6 -->> How comfortable the car seat is
Topic 7 -->> Problems with maintenance and warranty
Topic 8 -->> Problems with mileage and cost